## ListingKey ListingNumber ListingCreationDate
## 1 1021339766868145413AB3B 193129 2007-08-26 19:09:29.263000000
## 2 10273602499503308B223C1 1209647 2014-02-27 08:28:07.900000000
## 3 0EE9337825851032864889A 81716 2007-01-05 15:00:47.090000000
## 4 0EF5356002482715299901A 658116 2012-10-22 11:02:35.010000000
## 5 0F023589499656230C5E3E2 909464 2013-09-14 18:38:39.097000000
## 6 0F05359734824199381F61D 1074836 2013-12-14 08:26:37.093000000
## CreditGrade Term LoanStatus ClosedDate BorrowerAPR BorrowerRate
## 1 C 36 Completed 2009-08-14 00:00:00 0.16516 0.1580
## 2 36 Current 0.12016 0.0920
## 3 HR 36 Completed 2009-12-17 00:00:00 0.28269 0.2750
## 4 36 Current 0.12528 0.0974
## 5 36 Current 0.24614 0.2085
## 6 60 Current 0.15425 0.1314
## LenderYield EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## 1 0.1380 NA NA NA
## 2 0.0820 0.07960 0.0249 0.05470
## 3 0.2400 NA NA NA
## 4 0.0874 0.08490 0.0249 0.06000
## 5 0.1985 0.18316 0.0925 0.09066
## 6 0.1214 0.11567 0.0449 0.07077
## ProsperRating..numeric. ProsperRating..Alpha. ProsperScore
## 1 NA NA
## 2 6 A 7
## 3 NA NA
## 4 6 A 9
## 5 3 D 4
## 6 5 B 10
## ListingCategory..numeric. BorrowerState Occupation EmploymentStatus
## 1 0 CO Other Self-employed
## 2 2 CO Professional Employed
## 3 0 GA Other Not available
## 4 16 GA Skilled Labor Employed
## 5 2 MN Executive Employed
## 6 1 NM Professional Employed
## EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
## 1 2 True True
## 2 44 False False
## 3 NA False True
## 4 113 True False
## 5 44 True False
## 6 82 True False
## GroupKey DateCreditPulled
## 1 2007-08-26 18:41:46.780000000
## 2 2014-02-27 08:28:14
## 3 783C3371218786870A73D20 2007-01-02 14:09:10.060000000
## 4 2012-10-22 11:02:32
## 5 2013-09-14 18:38:44
## 6 2013-12-14 08:26:40
## CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## 1 640 659 2001-10-11 00:00:00
## 2 680 699 1996-03-18 00:00:00
## 3 480 499 2002-07-27 00:00:00
## 4 800 819 1983-02-28 00:00:00
## 5 680 699 2004-02-20 00:00:00
## 6 740 759 1973-03-01 00:00:00
## CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## 1 5 4 12
## 2 14 14 29
## 3 NA NA 3
## 4 5 5 29
## 5 19 19 49
## 6 21 17 49
## OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## 1 1 24 3
## 2 13 389 3
## 3 0 0 0
## 4 7 115 0
## 5 6 220 1
## 6 13 1410 0
## TotalInquiries CurrentDelinquencies AmountDelinquent
## 1 3 2 472
## 2 5 0 0
## 3 1 1 NA
## 4 1 4 10056
## 5 9 0 0
## 6 2 0 0
## DelinquenciesLast7Years PublicRecordsLast10Years
## 1 4 0
## 2 0 1
## 3 0 0
## 4 14 0
## 5 0 0
## 6 0 0
## PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## 1 0 0 0.00
## 2 0 3989 0.21
## 3 NA NA NA
## 4 0 1444 0.04
## 5 0 6193 0.81
## 6 0 62999 0.39
## AvailableBankcardCredit TotalTrades TradesNeverDelinquent..percentage.
## 1 1500 11 0.81
## 2 10266 29 1.00
## 3 NA NA NA
## 4 30754 26 0.76
## 5 695 39 0.95
## 6 86509 47 1.00
## TradesOpenedLast6Months DebtToIncomeRatio IncomeRange
## 1 0 0.17 $25,000-49,999
## 2 2 0.18 $50,000-74,999
## 3 NA 0.06 Not displayed
## 4 0 0.15 $25,000-49,999
## 5 2 0.26 $100,000+
## 6 0 0.36 $100,000+
## IncomeVerifiable StatedMonthlyIncome LoanKey
## 1 True 3083.333 E33A3400205839220442E84
## 2 True 6125.000 9E3B37071505919926B1D82
## 3 True 2083.333 6954337960046817851BCB2
## 4 True 2875.000 A0393664465886295619C51
## 5 True 9583.333 A180369302188889200689E
## 6 True 8333.333 C3D63702273952547E79520
## TotalProsperLoans TotalProsperPaymentsBilled OnTimeProsperPayments
## 1 NA NA NA
## 2 NA NA NA
## 3 NA NA NA
## 4 NA NA NA
## 5 1 11 11
## 6 NA NA NA
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## 1 NA NA
## 2 NA NA
## 3 NA NA
## 4 NA NA
## 5 0 0
## 6 NA NA
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## 1 NA NA
## 2 NA NA
## 3 NA NA
## 4 NA NA
## 5 11000 9947.9
## 6 NA NA
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## 1 NA 0
## 2 NA 0
## 3 NA 0
## 4 NA 0
## 5 NA 0
## 6 NA 0
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## 1 NA 78 19141
## 2 NA 0 134815
## 3 NA 86 6466
## 4 NA 16 77296
## 5 NA 6 102670
## 6 NA 3 123257
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## 1 9425 2007-09-12 00:00:00 Q3 2007
## 2 10000 2014-03-03 00:00:00 Q1 2014
## 3 3001 2007-01-17 00:00:00 Q1 2007
## 4 10000 2012-11-01 00:00:00 Q4 2012
## 5 15000 2013-09-20 00:00:00 Q3 2013
## 6 15000 2013-12-24 00:00:00 Q4 2013
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 1 1F3E3376408759268057EDA 330.43 11396.14
## 2 1D13370546739025387B2F4 318.93 0.00
## 3 5F7033715035555618FA612 123.32 4186.63
## 4 9ADE356069835475068C6D2 321.45 5143.20
## 5 36CE356043264555721F06C 563.97 2819.85
## 6 874A3701157341738DE458F 342.37 679.34
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## 1 9425.00 1971.14 -133.18
## 2 0.00 0.00 0.00
## 3 3001.00 1185.63 -24.20
## 4 4091.09 1052.11 -108.01
## 5 1563.22 1256.63 -60.27
## 6 351.89 327.45 -25.33
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## 1 0 1 0
## 2 0 1 0
## 3 0 1 0
## 4 0 1 0
## 5 0 1 0
## 6 0 1 0
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## 1 0 0 258
## 2 0 0 1
## 3 0 0 41
## 4 0 0 158
## 5 0 0 20
## 6 0 0 1
This dataset contains 113,937 loans with 81 variables. EDA will be performed on a subset of the variables in three stages: Univariate Plots, Bivariate Plots, and Multivariate Plots. Prosper is a lending platform that is a good option for those who can’t get a loan from a traditional bank and don’t want the high interest rates offered by credit cards and payday loans. The workflow, by actor, is as follows. Borrower: submits a loan application. Prosper: provides the loan after checking with several organizations that the borrower meets its criteria.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1000 4000 6500 8337 12000 35000
From the histogram above, the most common loan amount is $4,000, followed by $10,000 and $15,000, so the peaks fall between $4,000 and $15,000.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.0000 0.1340 0.1840 0.1928 0.2500 0.4975
From the histogram above, the borrower rate is centered around 20% (median 0.184, mean 0.193), and most values lie between 0.1 and 0.3 (10% to 30%).
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 1.00 4.00 6.00 5.95 8.00 11.00 29084
From the bar chart, borrowers' Prosper scores range from 1 to 11 with a median of 6; 29,084 loans have no score (NA).
##
## Cancelled Chargedoff Completed
## 5 11992 38074
## Current Defaulted FinalPaymentInProgress
## 56576 5018 205
## Past Due (>120 days) Past Due (1-15 days) Past Due (16-30 days)
## 16 806 265
## Past Due (31-60 days) Past Due (61-90 days) Past Due (91-120 days)
## 363 313 304
## Current Completed Chargedoff
## 56576 38074 0
## FinalPaymentInProgress Defaulted Cancelled
## 205 5018 5
## Past Due (1-15 days) Past Due (16-30 days) Past Due (31-60 days)
## 806 265 363
## Past Due (61-90 days) Past Due (91-120 days) Past Due (>120 days)
## 313 304 16
## NA's
## 11992
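From the bar chart, most loans are Current (56,576), followed by Completed (38,074) and Chargedoff (11,992). A smaller number of loans are past due. In the reordered summary, Chargedoff shows 0 and the 11,992 charged-off loans appear to have moved to NA's, so that level seems to have been lost when the statuses were reordered.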
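The ranking of loan statuses can be checked directly from the counts in the first table. The sketch below is written in Python (the report's own source code is not shown) and hard-codes those counts; collapsing the six past-due buckets into a single `PastDue` group is an assumption made here for illustration, mirroring how the later charts refer to past-due loans in general.

```python
# Counts copied from the first LoanStatus table in this report.
status_counts = {
    "Cancelled": 5,
    "Chargedoff": 11992,
    "Completed": 38074,
    "Current": 56576,
    "Defaulted": 5018,
    "FinalPaymentInProgress": 205,
    "Past Due (>120 days)": 16,
    "Past Due (1-15 days)": 806,
    "Past Due (16-30 days)": 265,
    "Past Due (31-60 days)": 363,
    "Past Due (61-90 days)": 313,
    "Past Due (91-120 days)": 304,
}

total = sum(status_counts.values())

# Assumption for illustration: merge every "Past Due (...)" bucket into one group.
grouped = {}
for status, count in status_counts.items():
    key = "PastDue" if status.startswith("Past Due") else status
    grouped[key] = grouped.get(key, 0) + count

# Share of all loans held by each (grouped) status.
shares = {status: count / total for status, count in grouped.items()}

# Print statuses from most to least common.
for status, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{status:<24}{grouped[status]:>7}{share:>8.1%}")
```

The counts sum to the 113,937 loans stated in the introduction, which is a quick way to confirm that no status was dropped from the first table.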
## Employed Full-time Part-time Self-employed Retired
## 67322 26355 1088 6134 795
## Not employed Not available Other NA NA's
## 835 5347 3806 0 2255
From the bar chart above, most borrowers report an employment status of Employed (67,322), followed by Full-time (26,355).
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 3200 4667 5608 6825 1750003
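From the histogram above, most borrowers' stated monthly income lies between about $3,200 and $6,825 (the first and third quartiles), with a median of $4,667. The maximum of $1,750,003 is an extreme outlier.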
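To put the monthly income summary in context, the sketch below (Python, using only the quartiles printed above) annualizes the quartiles and applies Tukey's 1.5 × IQR rule to show how far the maximum sits from the bulk of the data. The fence rule is a common convention introduced here for illustration; it is not something the original analysis applied.

```python
# Values copied from the StatedMonthlyIncome summary above.
q1, median, q3, maximum = 3200, 4667, 6825, 1750003

# Interquartile range and Tukey's upper fence (an illustrative convention,
# not part of the original analysis).
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr

# Annualized quartiles, for comparison with the yearly IncomeRange column.
annual = {"q1": q1 * 12, "median": median * 12, "q3": q3 * 12}

print("IQR:", iqr)
print("Upper fence:", upper_fence)
print("Maximum exceeds fence:", maximum > upper_fence)
print("Annualized quartiles:", annual)
```

Under this rule, any stated monthly income above the fence would be treated as a candidate outlier, which suggests trimming or a log scale before plotting the histogram.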
This dataset contains 113,937 loans with 81 variables. Prosper is a lending platform that is a good option for those who can’t get a loan from a traditional bank and don’t want the high interest rates offered by credit cards and payday loans. The workflow, by actor, is as follows. Borrower: submits a loan application. Prosper: provides the loan after checking with several organizations that the borrower meets its criteria.
The main features of interest in the dataset are BorrowerRate and ProsperScore. The investigation above suggests a relationship between them, with ProsperScore affecting BorrowerRate; LoanOriginalAmount might also affect BorrowerRate.
No
##
## Pearson's product-moment correlation
##
## data: loandata$BorrowerRate and loandata$ProsperScore
## t = -248.98, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.6536072 -0.6458311
## sample estimates:
## cor
## -0.6497361
From the chart above, BorrowerRate decreases as ProsperScore increases. The correlation coefficient is -0.65, so BorrowerRate has a strong negative relationship with ProsperScore.
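The figures in the Pearson test output above can be reproduced from just the reported coefficient and degrees of freedom. The sketch below is in Python rather than R, and the helper names `pearson_t` and `fisher_ci` are hypothetical. It uses the standard relations t = r·sqrt(df / (1 − r²)) and a Fisher z-transform confidence interval, and it also checks that the sample size implied by df matches the number of loans with a non-missing ProsperScore.

```python
import math
from statistics import NormalDist


def pearson_t(r, df):
    """t statistic for a Pearson correlation r with df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


def fisher_ci(r, n, level=0.95):
    """Confidence interval for r via the Fisher z-transform."""
    z = math.atanh(r)
    half_width = NormalDist().inv_cdf(0.5 + level / 2) / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Values copied from the test output above.
r, df = -0.6497361, 84851
n = df + 2  # number of complete (BorrowerRate, ProsperScore) pairs

t = pearson_t(r, df)
low, high = fisher_ci(r, n)

# Loans with a ProsperScore: total loans minus the NA count from the summary.
complete_scores = 113937 - 29084

print(f"t = {t:.2f}")
print(f"95% CI = ({low:.7f}, {high:.7f})")
print("n matches non-missing ProsperScore count:", n == complete_scores)
```

The match between n and the non-missing count indicates that the test silently dropped the 29,084 loans without a ProsperScore, so the coefficient describes only the scored loans.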
##
## Pearson's product-moment correlation
##
## data: loandata$BorrowerRate and loandata$StatedMonthlyIncome
## t = -30.155, df = 113940, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.09473938 -0.08321827
## sample estimates:
## cor
## -0.0889818
From the chart above, BorrowerRate decreases as StatedMonthlyIncome increases. The correlation coefficient is -0.089, so the relationship between BorrowerRate and StatedMonthlyIncome is weak.
From the chart above, the mean BorrowerRate for Current and Completed loans is lower than for late payments in general (i.e. PastDue), Defaulted, and Chargedoff loans.
From the chart above, the mean BorrowerRate for borrowers who are not employed is higher than for the other employment statuses.
From the chart above, Current loans have the highest StatedMonthlyIncome, while late payments in general (i.e. PastDue) have lower StatedMonthlyIncome.
The chart above shows ProsperScore against the borrower's stated monthly payment for all loan statuses except Completed.
The chart above shows ProsperScore against the borrower's stated monthly payment for all loan statuses except Current.
The chart above shows ProsperScore against the borrower's stated monthly payment for all loan statuses except PastDue.
The chart above shows ProsperScore against the borrower's stated monthly payment for all loan statuses except Chargedoff.
The chart above shows ProsperScore against the borrower's stated monthly payment for all loan statuses except Defaulted.
The mean BorrowerRate decreases as ProsperScore increases. The correlation coefficient is -0.65, so BorrowerRate has a strong relationship with ProsperScore.
The mean BorrowerRate decreases as LoanOriginalAmount increases. The correlation coefficient is -0.33, so BorrowerRate has a moderate relationship with LoanOriginalAmount.
The mean BorrowerRate decreases as StatedMonthlyIncome increases. The correlation coefficient is -0.088, so BorrowerRate has a weak relationship with StatedMonthlyIncome.
The mean BorrowerRate for Current and Completed loans is lower than for late payments in general (i.e. PastDue), Defaulted, and Chargedoff loans. The mean BorrowerRate for borrowers who are not employed is higher than for the other employment statuses.
For LoanStatus and StatedMonthlyIncome: Current loans have the highest StatedMonthlyIncome, and late payments in general (i.e. PastDue) have lower StatedMonthlyIncome.
The strongest relationship is between BorrowerRate and ProsperScore: the mean BorrowerRate decreases as ProsperScore increases. BorrowerRate also has a moderate relationship with LoanOriginalAmount. BorrowerRate therefore appears to be affected by both ProsperScore and LoanOriginalAmount.
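One way to compare the three relationships on a common scale is to square each correlation coefficient, which gives the share of variance in BorrowerRate that a straight-line fit on that variable alone would account for. The sketch below uses the rounded coefficients quoted above; it is written in Python for illustration and is not part of the original analysis.

```python
# Correlation of each variable with BorrowerRate, as quoted in the text above.
correlations = {
    "ProsperScore": -0.65,
    "LoanOriginalAmount": -0.33,
    "StatedMonthlyIncome": -0.088,
}

# r squared: fraction of BorrowerRate variance explained by a simple linear fit.
r_squared = {name: r ** 2 for name, r in correlations.items()}

# Variables ordered from strongest to weakest linear relationship.
ranked = sorted(r_squared, key=r_squared.get, reverse=True)

for name in ranked:
    print(f"{name:<22} r = {correlations[name]:>7.3f}   r^2 = {r_squared[name]:.4f}")
```

Read this way, ProsperScore alone accounts for roughly 42% of the variance in BorrowerRate, LoanOriginalAmount for about 11%, and StatedMonthlyIncome for under 1%.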
From the chart above, many Current and Completed loans have a lower BorrowerRate and a higher ProsperScore.
The chart above confirms that the mean BorrowerRate decreases as ProsperScore increases for loan statuses such as Current and Completed.
The chart above confirms that the mean BorrowerRate decreases as ProsperScore increases across employment statuses.
This is confirmed for some LoanStatus values, such as Completed and Current.
The relationship between BorrowerRate, ProsperScore, and LoanStatus was confirmed above.
The plot above shows the number of loans by BorrowerRate. Rates run from about 0.05 to about 0.35, with a peak near 0.15.
##
## Pearson's product-moment correlation
##
## data: loandata$BorrowerRate and loandata$ProsperScore
## t = -248.98, df = 84851, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.6536072 -0.6458311
## sample estimates:
## cor
## -0.6497361
BorrowerRate decreases as ProsperScore increases. The correlation coefficient is -0.65, so BorrowerRate has a strong relationship with ProsperScore.
The mean BorrowerRate decreases as ProsperScore increases for loan statuses such as Current and Completed.
This dataset contains 113,937 loans with 81 variables. EDA was performed on a subset of the variables: BorrowerRate, ProsperScore, LoanOriginalAmount, LoanStatus, StatedMonthlyIncome, and EmploymentStatus. First came Univariate Plots for single variables, then Bivariate Plots for pairs of variables, and finally Multivariate Plots combining categorical and continuous variables. I focused on BorrowerRate and its relationship with the other variables, and found the following: the mean BorrowerRate decreases as ProsperScore increases, and BorrowerRate has a strong relationship with ProsperScore.
The mean BorrowerRate decreases as LoanOriginalAmount increases. BorrowerRate has a moderate relationship with LoanOriginalAmount.
The mean BorrowerRate decreases as StatedMonthlyIncome increases. BorrowerRate has a weak relationship with StatedMonthlyIncome.
The mean BorrowerRate for Current and Completed loans is lower than for late payments in general (i.e. PastDue), Defaulted, and Chargedoff loans. The mean BorrowerRate for borrowers who are not employed is higher than for the other employment statuses.
A lower BorrowerRate goes with a higher ProsperScore, and this is confirmed for some LoanStatus values, such as Completed and Current.
For future work, I may periodically explore more variables to improve the process workflow.